How to Benchmark Embedding Models On Your Own Data

Learn how to benchmark embedding models on your own data in this course for beginners.

In this course, you will learn:
- The limitations of extracting text from PDF files with Python libraries, and how to overcome them with the help of VLMs (Vision Language Models).
- How to divide the extracted text into chunks that preserve context.
- How to generate questions for each chunk using LLMs (Large Language Models).
- How to use embedding models to create vector representations of the chunks and questions.
- How to use both open source and proprietary embedding models.
- How to use llama.cpp to run models in the GGUF format locally on your machine.
- How to benchmark different embedding models using various metrics and statistical tests with the help of ranx.
- How to plot the vector representations to see whether clusters are forming.
- How to interpret the p-value that a statistical test provides.
- And much more!

The slides, notebook, and scripts are available in a GitHub repository, and the dataset is published separately. To connect with Imad Saddik, see his LinkedIn, YouTube, and website.

⭐️ Course Contents ⭐️
(0:00:00) About the course
(0:06:05) Introduction
(0:17:58) Extracting text from PDF documents
(1:01:08) Divide text into coherent chunks
(1:23:10) Generate question-answer pairs from text chunks
(1:38:48) Embed text chunks and questions
(2:17:06) Statistical tests and metrics
(3:12:01) Expanding the dataset and adding more languages
(3:45:
  2026/01/12      youtube
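
The benchmarking idea in the description (embed the chunks and the questions generated from them, then check whether each question retrieves its source chunk) can be sketched in a few lines. This is only an illustration under stated assumptions, not the course's code: the 2-D vectors are made-up stand-ins for real embeddings, the names `chunk_vecs`, `question_vecs`, `source_chunk`, and `evaluate` are hypothetical, hit rate and MRR are example metrics chosen here, and plain Python is used in place of ranx so the sketch has no dependencies.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(question_vec, chunk_vecs):
    """Return chunk ids ordered from most to least similar to the question."""
    return sorted(
        chunk_vecs,
        key=lambda chunk_id: cosine(question_vec, chunk_vecs[chunk_id]),
        reverse=True,
    )


def evaluate(question_vecs, chunk_vecs, source_chunk, k=1):
    """Compute hit rate at k and mean reciprocal rank (MRR)."""
    hits = 0
    reciprocal_ranks = []
    for question_id, question_vec in question_vecs.items():
        ranking = rank_chunks(question_vec, chunk_vecs)
        # 1-based position of the chunk the question was generated from.
        rank = ranking.index(source_chunk[question_id]) + 1
        hits += rank <= k
        reciprocal_ranks.append(1.0 / rank)
    n = len(question_vecs)
    return {"hit_rate": hits / n, "mrr": sum(reciprocal_ranks) / n}


# Toy data (assumption): real embeddings have hundreds of dimensions.
chunk_vecs = {"c1": [1.0, 0.0], "c2": [0.0, 1.0], "c3": [1.0, 1.0]}
question_vecs = {"q1": [0.9, 0.1], "q2": [0.2, 0.8], "q3": [1.0, 0.2]}
source_chunk = {"q1": "c1", "q2": "c2", "q3": "c3"}

scores = evaluate(question_vecs, chunk_vecs, source_chunk, k=1)
print(scores)
```

Running the same evaluation once per embedding model gives per-model scores that can then be compared; deciding whether a difference is meaningful is where the statistical tests covered in the course come in.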
